Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint #1250

Open
wants to merge 2,166 commits into
base: main
Choose a base branch
from

Conversation

PikachuHyA
Copy link
Collaborator

fix #1246

While following the structure of the original code generation, I overlooked the getTypeInfoHelper function, which resulted in unions not being handled appropriately. see #1220 .
In this patch, I have implemented the getTypeInfoHelper function and marked enum and bitint as NYI.

For further details, please refer to the specific commit: llvm/llvm-project@a5986b9#diff-9964470fc240f2670aab82f768ff846ce19ea37a06a47cd7b4599fa3ef92165cL109.

lanza and others added 30 commits November 22, 2024 18:23
1. Add new `cir.vtt.address_point` op for visiting the element of VTT to
initialize the virtual pointer.
2. Implement `getVirtualBaseClassOffset` method which provides a virtual offset
to adjust to actual virtual pointers in virtual base.
3. Follows the original clang CodeGen scheme for the implementation of most
other parts.

@bcardosolopes's note: this is cherry-picked from an older PR from Jing Zhang and
slightly modified for updates: applied review, test, doc and operation syntax.
It does not yet has LLVM lowering support, I'm going to make incremental
changes on top of this.  Any necessary CIR modifications to this design should
follow up shortly too. Also, to make this work I also added more logic to
`addImplicitStructorParam`s` and `buildThisParam`.
as title. 
The generated code is the same as Clang codeden except in a small
discrepancy when GEP:
OG generates code like this: 
`%6 = getelementptr inbounds <4 x i16>, ptr %retval.i, i32 1`
CIR generates a bit differently:
`%6 = getelementptr <4 x i16>, ptr %retval.i, i64 1`
Ptr offest might be trivial because choosing i64 over i32 as index type
seems to be LLVM Dialect's choice.

The lack of `inbounds` keyword might be an issue as
`mlir::cir::PtrStrideOp` is currently not lowering to LLVM:GEPOp with
`inbounds` attribute as `mlir::cir::PtrStrideOp` itself has no
`inbounds`. It's probably because there was no need for it though we do
have an implementation of [`CIRGenFunction::buildCheckedInBoundsGEP`
](https://github.com/llvm/clangir/blob/10d6f4b94da7e0181a070f0265d079419d96cf78/clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp#L2762).

Anyway, the issue is not in the scope of this PR and should be addressed
in a separate PR. If we think this is an issue, I can create another PR
and probably add optional attribute to `mlir::cir::PtrStrideOp` to
achieve it.

In addition to lowering work, a couple of more works:

1. Did a little refactoring on variable name changing into desired
CamelBack case.
2. Changed neon-misc RUN Options to be consistent with other neon test
files and make test case more concise.
…#951)

as title. 
There are two highlights of the PR

1. The PR introduced a new test file to cover neon intrinsics that move
data, which is a big category. This would the 5th neon test file. And
we're committed to keep total number of neon test files within 6. This
file uses another opt option instcombine, which makes test LLVM code
more concise, and our -fclangir generated LLVM code would be identical
to OG with this. It looks like OG did some instcombine optimization.

2. `getIntFromMLIRValue` helper function could be substituted by
[`mlir::cir::IntAttr getConstOpIntAttr` in
CIRGenAtomic.cpp](https://github.com/llvm/clangir/blob/24b24557c98d1c031572a567b658cfb6254f8a89/clang/lib/CIR/CodeGen/CIRGenAtomic.cpp#L337).
The function `mlir::cir::IntAttr getConstOpIntAttr` is doing more than
`getIntFromMLIRValue`, and there is FIXME in the comment, so not sure if
we should just use `mlir::cir::IntAttr getConstOpIntAttr`, either is
fine with me.
…ly (llvm#961)

Close llvm#957

the previous algorithm to convert a multiple dimension array to a tensor
is: fill the value one by one and fill the zero values in conditions.
And it has some problems handling the multiple dimension array as above
issue shows so that the generated values are not in the same shape with
the original array.

the new algorithm here is, full fill the values ahead of time with the
correct element size and full fill the values to different slots and we
only need to maintain the index to write.

I feel the new version has better performance (avoid allocation) and
better readability slightly.
…o llvm intrinsic (llvm#960)

This PR refactored Neon Built in code in clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp a bit to make it cleaner.

Also changed RUNOption of test file clang/test/CIR/CodeGen/AArch64/neon-arith.c to make test more concise, and easy to compare against OG (to compare, just remove -fclangir from llvm gen part of RUN, and the test should still pass)
This is the following up fix for the previous fix
llvm#961

See the attached new test for the reproducer.

Sorry for the initial overlook.
- The flag is the default even for cc1, so make it disable two level deep.
- While here, remove the unnecessary flag disable for pure `-emit-cir`.
…nv-lowering

While here, add more unrecheables to cover some of the current errors, so that
our users can see a clear message instead of a random cast assert of sorts.
This covers at least all crashes seen when removing
-fno-clangir-call-conv-lowering from all tests, there are probably other things
we'll find as we exercise this path.
These are not meant to be used by any other component, make sure it's very
specific.
This is the usual copy-paste-modify from CodeGen, though I changed all
the variable names to conform to our new style. All these functions
should be pulled out as common helpers when we're upstream.
llvm/llvm-project@1d0bd8e
moves a conditional from CodeGen to AST, and this follows suit for
consistency. (Our support for the Microsoft ABI is NYI anyway; this is
just to make things simpler to follow when matching up logic between
CodeGen and CIRGen.)
There is no change to testing functionality. This refacot let those
files have the same Run options that is easier to maintain and extend.
This PR adds initial support for the `__int128` type. The `!cir.int`
type is extended to support 128-bit integer types.

This PR comes with a simple test that verifies the CIRGen and LLVM
lowering of `!s128i` and `!u128i` work.

Resolve llvm#953 .
…d (in CIR + Direct to LLVM) (llvm#966)

Fixes llvm#931
Added type definition in CIRTypes.td, created appropriate functions for
the same in CIRTypes.cpp like getPreferredAlignment,
getPreferredAlignment, etc. Optionally added lowering in LowerToLLVM.cpp
… pipeline

This is causing lots of churn. `-fclangir-call-conv-lowering` is not mature
enough, assumptions are leading to crashes we cannot track with special
messages, leading to not great user experience. Turn this off until we have
someone dedicated to roll this out.
While here add some bits for ptr auth and match OG.
…lvm#990)

LLVM's verifier enforces this, which was previously causing us to fail
verification. This is a bit of a band-aid; the overall linkage and
visibility setting flow needs some work to match the original.
ghehg and others added 23 commits December 12, 2024 09:41
This would facilitate implementation of neon intrinsic `neon_vmax_v` and
`__builtin_elementwise_max`, and potentially future optimizations. CIR
BinOp supports vector type.
Floating point has already been supported by FMaxOp.
This option creates and links all tools against a single libLLVM shared
library (and the corresponding CLANG_LINK_CLANG_DYLIB option also gets
turned on by default). In order for this to work, we need to use
LINK_COMPONENTS instead of LINK_LIBS for all LLVM dependencies, and
clang_target_link_libraries for all Clang dependencies, so that they get
rewritten to use the dylib. Remove llvm_update_compile_flags while I'm
here, since the build macros handle that internally. Before this change,
we'd link against certain LLVM libraries both statically and
dynamically, leading to test failures from duplicate singletons.

The way this works for MLIR is fragile right now: MLIR can create its
own dylib as well but doesn't have build system support for linking
against that dylib. We end up folding the MLIR libraries into
libclang-cpp.so (because all Clang dependencies get pulled into it), but
MLIR tools still link the MLIR libraries statically. It'll still work,
but BUILD_SHARED_LIBS is possibly a better alternative for development.
Distributions like Fedora build their LLVM packages with
LLVM_LINK_LLVM_DYLIB, so we'll want to eventually have good MLIR support
for that setup too.
…lvm#1203)

C/C++ functions returning void had an explicit !cir.void return type
while not having any returned value, which was breaking a lot of MLIR
invariants when the CIR dialect is used in a greater context, for
example with the inliner. Now, a C/C++ function returning void has not
return type and no return values, which does not break the MLIR
invariant about the same number of return types and returned values.
This change keeps the same parsing/pretty-printed syntax as before for
compatibility.
Combined implementaiton with `neon_vaddlvq_u16`
OG somehow implemented them separately but they are no different except
signess and intrinsic name
[OG's
neon_vaddlvq_s16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13483)
[OG's
neon_vaddlvq_u16](https://github.com/llvm/clangir/blob/2b1a638ea07ca10c5727ea835bfbe17b881175cc/clang/lib/CodeGen/CGBuiltin.cpp#L13449)
The module-level uwtable attribute controls the unwind tables for any
synthesized functions, and the function-level attribute controls them
for those functions. I'll add support for this attribute to the LLVM
dialect as well, but translate it from CIR directly for now to avoid
waiting on the MLIR addition and a subsequent rebase.
llvm#1232)

Match `CodeGenModule::SetLLVMFunctionAttributesForDefinition` so that we
can see what's missing and have a good base to build upon.
I am working on a clangir based solution to improve C++'s safety
(https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245). This is
similar with the previous analysis only approach I proposed, where we
may not care about the lowered code. And this is what I described as
layering problems in
https://discourse.llvm.org/t/rfc-a-clangir-based-safe-c/83245

This is similar with the other issue proposed
llvm#1128. We'd better to emit the
higher level operations and lowering/optimizing it later.

This is also inspired our handling method for VAArg, where we use ABI
information to lower things during the passes. It gives me more
confidence that I am doing things right.
The PR should help us to get rid of NYI 
`NYI UNREACHABLE executed at
clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:899`
[Relevant OG code
here](https://github.com/llvm/clangir/blob/7fb608d4d1b72c25a1739a1bd66c9024208819cb/clang/lib/CodeGen/CGExpr.cpp#L4767):
I put `HasExplicitObjectParameter` support as a missing feature, which
is a new C++23 feature.
…ge (llvm#1214)

#### The Problem
Let's take a look at the following code:
```
struct A {
~A() {}
};

int foo() { return 42; }
void bar() {
  A a;
  int b = foo(); 
}
```
The call to `foo` guarded by the synthetic `tryOp` looks approximately
like the following:
```
cir.try synthetic cleanup {
   %2 = cir.call exception @_Z3foov() : () -> !s32i cleanup {
      cir.call @_ZN1AD1Ev(%0) : (!cir.ptr<!ty_A>) -> () extra(#fn_attr1)   // call to destructor of 'A'
      cir.yield
    } 
    cir.yield
} catch [#cir.unwind {
    cir.resume 
}] 
cir.store %2, %1: !s32i, !cir.ptr<!s32i>  // CIR verification error
```
The result of the `foo` call is in the `try` region - and is not
accessible from the outside, so the code generation fails with
`operand #0 does not dominate its use` .

#### Solution
So we have several options how to handle this properly. 
1. We may intpoduce a new operation here, like `TryCall` but probably
more high level one, e.g. introduce the `InvokeOp`.
2. Also, we may add the result to `TryOp`.
3. The fast fix that is implemented in this PR is a temporary `alloca`
where we store the call result right in the try region. And the result
of the whole `emitCall` is a `load` from the temp `alloca`.

So this PR is both the request for changes and an open discussion as
well - how to handle this properly. So far I choose the third approach.
If it's ok - I will need to create one more PR with a similar fix for
the aggregated results or update this one.
This PR puts for-loop body, while-loop body, and do-while-loop body in
nested scopes. Allocas in the loop body are now push down to the nested
scope.

Resolve llvm#1218 .
There are two sets of intrinsics regarding Min and Max operations for
floating points

[Maximum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaximum-llvmmaximumop)
vs
[Maxnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrmaxnum-llvmmaxnumop)

[Minimum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminimum-llvmminimumop)
vs
[Minnum](https://mlir.llvm.org/docs/Dialects/LLVM/#llvmintrminnum-llvmminnumop)

[The difference is whether NaN should be propagated when one of the
inputs is
NaN](https://llvm.org/docs/LangRef.html#llvm-maximumnum-intrinsic)
Maxnum and Minnum would return number if one of inputs is NaN, and the
other is a number,
But 
Maximum and Minimum would return NaN (propagation of NaN)

And they are resolved to different ASM such as
[FMAX](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAX--vector---Floating-point-Maximum--vector--?lang=en)
vs
[FMAXNM](https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FMAXNM--vector---Floating-point-Maximum-Number--vector--?lang=en)

Both have user cases, we already implemented Maxnum and Minnum
But Maximum and Minimum has user cases in [neon intrinsic
](https://developer.arm.com/architectures/instruction-sets/intrinsics/vmax_f32
)
and [__builtin_elementwise_maximum
](https://github.com/llvm/clangir/blob/a989ecb2c55da1fe28e4072c31af025cba6c4f0f/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp#L53)
…lvm#1235)

Use iterator to visit std::initializer_list field reduce the readability
…d neon_vaddvq_f64 (llvm#1238)

[Neon intrinsic
definition](https://developer.arm.com/architectures/instruction-sets/intrinsics/vaddv_f32).
They are vector across operation which LLVM doesn't currently have a
generic intrinsic about it. As a side note for brainstorm, it might be
worth in the future for CIR to introduce Vector Across type operations
even though LLVM dialect doesn't have it yet. This would help to expose
opt opportunities.
E.g. a very trivial constant fold can happen if we are adding across a
constant vector.
…#1239)

This implementation is different from OG in the sense we chose to use
CIR op which eventually lowers to generic LLVM intrinsics instead of
llvm.aarch64.neon intrinsics
But down to the ASM level, [they are identical
](https://godbolt.org/z/Gbbos9z6Y).
…vm#1242)

This patch follows
llvm#1220 (comment) by
augmenting `CIR_Type` with a new field, `tbaaName`. Specifically, it
enables TableGen support for the `-gen-cir-tbaa-name-lowering` option,
allowing for the generation of `getTBAAName` functions based on the
`tbaaName`. This enhancement enables us to replace the hardcoded TBAA
names in the `getTypeName` function with the newly generated
`getTBAAName`.
This PR adds a bitcast when we rewrite globals type. Previously we just
set a new type and it worked.
But recently I started to test ClangIR with CSmith in order to find some
run time bugs and faced with the next problem.

```
typedef struct {
    int x : 15;   
    uint8_t y;
} S;

S g = { -12, 254};

int main() {    
    printf("%d\n", g.y);
    return 0;
}

```
The output for this program is  ... 127 but not 254!
The reason is that first global var is created with the type of struct
`S`, then `get_member` operation is generated with index `1`
and then after, the type of the global is rewritten - I assume because
of the anon struct created on the right side in the initialization.
But the `get_member` operation still wants to access to the field at
index `1` and get a wrong byte.
If we change the `y` type to `int` we will fail on the verification
stage. But in the example above it's a run time error!

This is why I suggest to add a bitcast once we have to rewrite the
global type.
…2/u32, (llvm#1240)

Co-authored-by: Bruno Cardoso Lopes <bruno.cardoso@gmail.com>
@bcardosolopes bcardosolopes changed the title [CIR][CIRGen][TBAA] Initial TBAA support for union, enum and bitint [CIR][CIRGen][TBAA][NFC] Skeleton for union, enum and bitint Dec 20, 2024
if (!codeGenOpts.PointerTBAA) {
return cir::TBAAScalarAttr::get(mlirContext, types.ConvertType(qty));
}
llvm_unreachable("NYI");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same problems we discussed about llvm_unreachable before is valid for the whole implementation of TBAA, please move to missing features for here, elsewhere and any future work

@bcardosolopes
Copy link
Member

Please make this only change enough to fix #1246, seems to be adding more than necessary for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CIR][CIRGen][TBAA] Struct with Union crashes during CodeGen